AITopics | topic segmentation

Collaborating Authors

topic segmentation

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Def-DTS: Deductive Reasoning for Open-domain Dialogue Topic Segmentation

Lee, Seungmin, Yoo, Yongsang, Jung, Minhwa, Song, Min

arXiv.org Artificial IntelligenceMay-28-2025

Dialogue Topic Segmentation (DTS) aims to divide dialogues into coherent segments. DTS plays a crucial role in various NLP downstream tasks, but suffers from chronic problems: data shortage, labeling ambiguity, and incremental complexity of recently proposed solutions. On the other hand, Despite advances in Large Language Models (LLMs) and reasoning strategies, these have rarely been applied to DTS. This paper introduces Def-DTS: Deductive Reasoning for Open-domain Dialogue Topic Segmentation, which utilizes LLM-based multi-step deductive reasoning to enhance DTS performance and enable case study using intermediate result. Our method employs a structured prompting approach for bidirectional context summarization, utterance intent classification, and deductive topic shift detection. In the intent classification process, we propose the generalizable intent list for domain-agnostic dialogue intent classification. Experiments in various dialogue settings demonstrate that Def-DTS consistently outperforms traditional and state-of-the-art approaches, with each subtask contributing to improved performance, particularly in reducing type 2 error. We also explore the potential for autolabeling, emphasizing the importance of LLM reasoning techniques in DTS.

computational linguistic, large language model, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2505.21033

Country:

Asia (1.00)
North America > United States > Minnesota (0.28)

Genre: Research Report > New Finding (0.93)

Industry: Health & Medicine (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)

Add feedback

Towards Conditioning Clinical Text Generation for User Control

Koraş, Osman Alperen, Bahnan, Rabi, Kleesiek, Jens, Dada, Amin

arXiv.org Artificial IntelligenceFeb-24-2025

Deploying natural language generation systems in clinical settings remains challenging despite advances in Large Language Models (LLMs), which continue to exhibit hallucinations and factual inconsistencies, necessitating human oversight. This paper explores automated dataset augmentation using LLMs as human proxies to condition LLMs for clinician control without increasing cognitive workload. On the BioNLP ACL'24 Discharge Me! Shared Task, we achieve new state-of-the-art results with simpler methods than prior submissions through more efficient training, yielding a 9\% relative improvement without augmented training and up to 34\% with dataset augmentation. Preliminary human evaluation further supports the effectiveness of our approach, highlighting the potential of augmenting clinical text generation for control to enhance relevance, accuracy, and factual consistency.

discharge summary, guideline, text generation, (14 more...)

arXiv.org Artificial Intelligence

2502.17571

Country:

Asia > Thailand > Bangkok > Bangkok (0.04)
North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
North America > United States > Michigan > Washtenaw County > Ann Arbor (0.04)
(10 more...)

Genre: Research Report > New Finding (0.67)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Health Care Providers & Services (0.94)
Health & Medicine > Pharmaceuticals & Biotechnology (0.93)
Health & Medicine > Health Care Technology (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Recent Trends in Linear Text Segmentation: a Survey

Ghinassi, Iacopo, Wang, Lin, Newell, Chris, Purver, Matthew

arXiv.org Artificial IntelligenceNov-25-2024

Linear Text Segmentation is the task of automatically tagging text documents with topic shifts, i.e. the places in the text where the topics change. A well-established area of research in Natural Language Processing, drawing from well-understood concepts in linguistic and computational linguistic research, the field has recently seen a lot of interest as a result of the surge of text, video, and audio available on the web, which in turn require ways of summarising and categorizing the mole of content for which linear text segmentation is a fundamental step. In this survey, we provide an extensive overview of current advances in linear text segmentation, describing the state of the art in terms of resources and approaches for the task. Finally, we highlight the limitations of available resources and of the task itself, while indicating ways forward based on the most recent literature and under-explored research directions.

computational linguistic, segmentation, text segmentation, (14 more...)

arXiv.org Artificial Intelligence

2411.16613

Country:

Europe > United Kingdom > England > Greater London > London (0.04)
Asia > Singapore (0.04)
Europe > Bulgaria (0.04)
(9 more...)

Genre:

Research Report (1.00)
Overview (1.00)

Industry: Media (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.95)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.94)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Add feedback

Advancing Topic Segmentation of Broadcasted Speech with Multilingual Semantic Embeddings

Shukla, Sakshi Deo, Denisov, Pavel, Turan, Tugtekin

arXiv.org Artificial IntelligenceSep-10-2024

Recent advancements in speech-based topic segmentation have highlighted the potential of pretrained speech encoders to capture semantic representations directly from speech. Traditionally, topic segmentation has relied on a pipeline approach in which transcripts of the automatic speech recognition systems are generated, followed by text-based segmentation algorithms. In this paper, we introduce an end-to-end scheme that bypasses this conventional two-step process by directly employing semantic speech encoders for segmentation. Focused on the broadcasted news domain, which poses unique challenges due to the diversity of speakers and topics within single recordings, we address the challenge of accessing topic change points efficiently in an end-to-end manner. Furthermore, we propose a new benchmark for spoken news topic segmentation by utilizing a dataset featuring approximately 1000 hours of publicly available recordings across six European languages and including an evaluation set in Hindi to test the model's cross-domain performance in a cross-lingual, zero-shot scenario. This setup reflects real-world diversity and the need for models adapting to various linguistic settings. Our results demonstrate that while the traditional pipeline approach achieves a state-of-the-art $P_k$ score of 0.2431 for English, our end-to-end model delivers a competitive $P_k$ score of 0.2564. When trained multilingually, these scores further improve to 0.1988 and 0.2370, respectively. To support further research, we release our model along with data preparation scripts, facilitating open research on multilingual spoken news topic segmentation.

dataset, segmentation, topic segmentation, (15 more...)

arXiv.org Artificial Intelligence

2409.06222

Country:

Europe > Germany (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
Europe > Sweden > Stockholm > Stockholm (0.04)
Asia > India (0.04)

Genre: Research Report > New Finding (0.86)

Industry:

Leisure & Entertainment (0.68)
Media > Radio (0.46)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Multimodal Fusion and Coherence Modeling for Video Topic Segmentation

Yu, Hai, Deng, Chong, Zhang, Qinglin, Liu, Jiaqing, Chen, Qian, Wang, Wen

arXiv.org Artificial IntelligenceAug-1-2024

Also, coherence is essential for data (Koshorek et al., 2018; Arnold et al., 2019), understanding logical structures and semantics. Enhancing contemporary supervised models (Lukasik et al., coherence modeling has achieved significant 2020; Somasundaran et al., 2020; Zhang et al., improvements in long document topic segmentation 2021; Yu et al., 2023) have demonstrated superior (Yu et al., 2023). Therefore, we improve performance compared to unsupervised approaches supervised VTS methods by thoroughly exploring (Riedl and Biemann, 2012; Solbiati et al., multimodal fusion and multimodal coherence 2021). Notably, supervised models that excel at modeling. We enhance multimodal fusion modeling long sequences (Zhang et al., 2021; Yu from the perspectives of model architecture and et al., 2023) are capable of capturing longer contextual pre-training and fine-tuning tasks. Specifically, we nuances and thereby achieve better topic segmentation compare various multimodal fusion architectures performance, compared to models that built upon Cross-Attention and Mixture-of-Experts model local sentence pairs or block pairs (Wang (MoE). We investigate the effect of multi-modal et al., 2017; Lukasik et al., 2020). In addition, contrastive learning for general pre-training and recent works (Somasundaran et al., 2020; Xing fine-tuning for strengthening cross-modal alignment.

proceedings, segmentation, topic segmentation, (15 more...)

arXiv.org Artificial Intelligence

2408.00365

Country: Asia (0.04)

Genre:

Research Report (0.82)
Instructional Material > Course Syllabus & Notes (0.46)

Industry: Education > Educational Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)

Add feedback

TreeSeg: Hierarchical Topic Segmentation of Large Transcripts

Gklezakos, Dimitrios C., Misiak, Timothy, Bishop, Diamond

arXiv.org Artificial IntelligenceJun-28-2024

From organizing recorded videos and meetings into chapters, to breaking down large inputs in order to fit them into the context window of commoditized Large Language Models (LLMs), topic segmentation of large transcripts emerges as a task of increasing significance. Still, accurate segmentation presents many challenges, including (a) the noisy nature of the Automatic Speech Recognition (ASR) software typically used to obtain the transcripts, (b) the lack of diverse labeled data and (c) the difficulty in pin-pointing the ground-truth number of segments. In this work we present TreeSeg, an approach that combines off-the-shelf embedding models with divisive clustering, to generate hierarchical, structured segmentations of transcripts in the form of binary trees. Our approach is robust to noise and can handle large transcripts efficiently. We evaluate TreeSeg on the ICSI and AMI corpora, demonstrating that it outperforms all baselines. Finally, we introduce TinyRec, a small-scale corpus of manually annotated transcripts, obtained from self-recorded video sessions.

partition, segmentation, treeseg, (15 more...)

arXiv.org Artificial Intelligence

2407.12028

Country:

Asia > Singapore (0.04)
North America > United States (0.04)
Europe > Switzerland (0.04)
(3 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.86)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.54)

Add feedback

Unsupervised Mutual Learning of Dialogue Discourse Parsing and Topic Segmentation

Xu, Jiahui, Jiang, Feng, Gao, Anningzhe, Li, Haizhou

arXiv.org Artificial IntelligenceJun-3-2024

The advancement of large language models (LLMs) has propelled the development of dialogue systems. Unlike the popular ChatGPT-like assistant model, which only satisfies the user's preferences, task-oriented dialogue systems have also faced new requirements and challenges in the broader business field. They are expected to provide correct responses at each dialogue turn, at the same time, achieve the overall goal defined by the task. By understanding rhetorical structures and topic structures via topic segmentation and discourse parsing, a dialogue system may do a better planning to achieve both objectives. However, while both structures belong to discourse structure in linguistics, rhetorical structure and topic structure are mostly modeled separately or with one assisting the other in the prior work. The interaction between these two structures has not been considered for joint modeling and mutual learning. Furthermore, unsupervised learning techniques to achieve the above are not well explored. To fill this gap, we propose an unsupervised mutual learning framework of two structures leveraging the global and local connections between them. We extend the topic modeling between non-adjacent discourse units to ensure global structural relevance with rhetorical structures. We also incorporate rhetorical structures into the topic structure through a graph neural network model to ensure local coherence consistency. Finally, we utilize the similarity between the two fused structures for mutual learning. The experimental results demonstrate that our methods outperform all strong baselines on two dialogue rhetorical datasets (STAC and Molweni), as well as dialogue topic datasets (Doc2Dial and TIAGE). We provide our code at https://github.com/Jeff-Sue/URT.

discourse, matrix, topic structure, (12 more...)

arXiv.org Artificial Intelligence

2405.19799

Country:

Asia > China > Guangdong Province > Shenzhen (0.05)
Asia > China > Hong Kong (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
(4 more...)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

Add feedback

With a Little Help from my (Linguistic) Friends: Topic Segmentation of Multi-party Casual Conversations

Decker, Amandine, Amblard, Maxime

arXiv.org Artificial IntelligenceFeb-5-2024

Topics play an important role in the global organisation of a conversation as what is currently discussed constrains the possible contributions of the participant. Understanding the way topics are organised in interaction would provide insight on the structure of dialogue beyond the sequence of utterances. However, studying this high-level structure is a complex task that we try to approach by first segmenting dialogues into smaller topically coherent sets of utterances. Understanding the interactions between these segments would then enable us to propose a model of topic organisation at a dialogue level. In this paper we work with open-domain conversations and try to reach a comparable level of accuracy as recent machine learning based topic segmentation models but with a formal approach. The features we identify as meaningful for this task help us understand better the topical structure of a conversation.

segmentation, utterance, xing and carenini, (14 more...)

arXiv.org Artificial Intelligence

2402.02837

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Sweden > Vaestra Goetaland > Gothenburg (0.04)
Europe > France > Grand Est > Meurthe-et-Moselle > Nancy (0.04)
(15 more...)

Genre: Research Report (0.64)

Industry:

Leisure & Entertainment (0.68)
Media (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.46)

Add feedback

Segmenting Messy Text: Detecting Boundaries in Text Derived from Historical Newspaper Images

Anderson, Carol, Crone, Phil

arXiv.org Artificial IntelligenceDec-20-2023

Text segmentation, the task of dividing a document into sections, is often a prerequisite for performing additional natural language processing tasks. Existing text segmentation methods have typically been developed and tested using clean, narrative-style text with segments containing distinct topics. Here we consider a challenging text segmentation task: dividing newspaper marriage announcement lists into units of one announcement each. In many cases the information is not structured into sentences, and adjacent segments are not topically distinct from each other. In addition, the text of the announcements, which is derived from images of historical newspapers via optical character recognition, contains many typographical errors. As a result, these announcements are not amenable to segmentation with existing techniques. We present a novel deep learning-based model for segmenting such text and show that it significantly outperforms an existing state-of-the-art method on our task.

boundary, linguist, segmentation, (15 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/ICPR48806.2021.9413279

2312.12773

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Virginia (0.04)
North America > United States > Utah > Utah County > Lehi (0.04)
(7 more...)

Genre:

Research Report > New Finding (0.68)
Research Report > Experimental Study (0.47)

Industry: Media > News (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Multi-Modal Video Topic Segmentation with Dual-Contrastive Domain Adaptation

Xing, Linzi, Tran, Quan, Caba, Fabian, Dernoncourt, Franck, Yoon, Seunghyun, Wang, Zhaowen, Bui, Trung, Carenini, Giuseppe

arXiv.org Artificial IntelligenceNov-30-2023

Video topic segmentation unveils the coarse-grained semantic structure underlying videos and is essential for other video understanding tasks. Given the recent surge in multi-modal, relying solely on a single modality is arguably insufficient. On the other hand, prior solutions for similar tasks like video scene/shot segmentation cater to short videos with clear visual shifts but falter for long videos with subtle changes, such as livestreams. In this paper, we introduce a multi-modal video topic segmenter that utilizes both video transcripts and frames, bolstered by a cross-modal attention mechanism. Furthermore, we propose a dual-contrastive learning framework adhering to the unsupervised domain adaptation paradigm, enhancing our model's adaptability to longer, more semantically complex videos. Experiments on short and long video corpora demonstrate that our proposed solution, significantly surpasses baseline methods in terms of both accuracy and transferability, in both intra- and cross-domain settings.

proceedings, segmentation, video, (15 more...)

arXiv.org Artificial Intelligence

2312.0022

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Genre: Research Report (0.82)

Industry:

Media (0.46)
Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Vision (0.89)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.47)

Add feedback